Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes
Authors
Abstract
Automating statistical modelling is a challenging problem that has far-reaching implications for artificial intelligence. The Automatic Statistician employs a kernel search algorithm to provide a first step in this direction for regression problems. However, this does not scale due to its O(N^3) running time for the model selection. This is undesirable not only because the average size of data sets is growing fast, but also because there is potentially more information in bigger data, implying a greater need for more expressive models that can discover finer structure. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm, to encompass big data within the boundaries of automated statistical modelling.
Similar resources
Scalable Structure Discovery in Regression using Gaussian Processes
Automatic Bayesian Covariance Discovery (ABCD) in Lloyd et al. (2014) provides a framework for automating statistical modelling as well as exploratory data analysis for regression problems. However, ABCD does not scale due to its O(N^3) running time for the kernel search. This is undesirable not only because the average size of data sets is growing fast, but also because there is potentially more ...
Covariance Kernels for Fast Automatic Pattern Discovery and Extrapolation with Gaussian Processes
Truly intelligent systems are capable of pattern discovery and extrapolation without human intervention. Bayesian nonparametric models, which can uniquely represent expressive prior information and detailed inductive biases, provide a distinct opportunity to develop intelligent systems, with applications in essentially any learning and prediction task. Gaussian processes are rich distributions ...
GPatt: Fast Multidimensional Pattern Extrapolation with Gaussian Processes
Gaussian processes are typically used for smoothing and interpolation on small datasets. We introduce a new Bayesian nonparametric framework – GPatt – enabling automatic pattern extrapolation with Gaussian processes on large multidimensional datasets. GPatt unifies and extends highly expressive kernels and fast exact inference techniques. Without human intervention – no hand crafting of kernel ...
The Automatic Statistician: A Relational Perspective
Gaussian Processes (GPs) provide a general and analytically tractable way of modeling complex time-varying, nonparametric functions. The Automatic Bayesian Covariance Discovery (ABCD) system constructs natural-language descriptions of time-series data by treating unknown time-series data nonparametrically using a GP with a composite covariance kernel function. Unfortunately, learning a composite co...
Automatic Construction and Natural-Language Description of Nonparametric Regression Models
This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural-language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important...
Journal title:
- CoRR
Volume: abs/1706.02524
Publication date: 2017